IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events: An I-Vector Based Approach for Audio Scene Detection
Abstract
The IEEE AASP Scene Classification challenge on user-generated content (UGC) aims to classify an audio recording as belonging to a specific scene such as busy street, office or supermarket. The difficulty of scene content analysis on UGC lies in the lack of structure and the acoustic variability of the data. The i-vector system is state-of-the-art in Speaker Verification and Scene Detection, and outperforms conventional Gaussian Mixture Model (GMM)-based approaches. The system compensates for undesired acoustic variability and extracts information from the acoustic environment, making it a meaningful choice for detection on UGC. This paper reports our results in the challenge using a hand-tuned i-vector system and MFCC features. Compared to the MFCC+GMM baseline system, our system increased the classification accuracy by 26.4% to about 65.8%. We discuss our approach and highlight the parameters in our system that were shown to significantly improve our classification accuracy.
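To make the baseline idea mentioned in the abstract more concrete, the following is a minimal sketch of frame-level generative scene classification. It is not the authors' i-vector system and not the challenge's actual baseline code: it assumes one diagonal Gaussian per scene (the one-component special case of a GMM), and it uses synthetic 13-dimensional frames as stand-ins for MFCC features. All function names and the scene centres are hypothetical.

```python
import math
import random


def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from a list of feature frames."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Floor the variance to avoid division by zero on degenerate data.
    var = [
        max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
        for d in range(dim)
    ]
    return mean, var


def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood under a diagonal Gaussian."""
    mean, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def classify(frames, models):
    """Pick the scene whose model gives the recording the highest score."""
    return max(models, key=lambda scene: avg_log_likelihood(frames, models[scene]))


def synthetic_recording(rng, centre, n_frames=200, dim=13):
    """Hypothetical stand-in for MFCC frames: Gaussian noise around a centre."""
    return [[rng.gauss(centre, 1.0) for _ in range(dim)] for _ in range(n_frames)]


rng = random.Random(0)
# Assumed, well-separated scene centres chosen only for illustration.
scene_centres = {"busy street": 0.0, "office": 3.0, "supermarket": -3.0}
models = {
    scene: fit_diag_gaussian(synthetic_recording(rng, centre))
    for scene, centre in scene_centres.items()
}

test_clip = synthetic_recording(rng, 3.0)
predicted = classify(test_clip, models)
print(predicted)
```

In a real system the synthetic frames would be replaced by MFCCs extracted from audio, and each scene would typically be modelled with several mixture components rather than a single Gaussian.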
Similar resources
IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events: An Exemplar-Based NMF Approach for Audio Event Detection
We present a novel, exemplar-based method for audio event detection based on non-negative matrix factorisation (NMF). Building on recent work in noise robust automatic speech recognition, we model events as a linear combination of dictionary atoms, and mixtures as a linear combination of overlapping events. The exemplar-based dictionary is created by extracting all available training data, artif...
IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events: Recognising Acoustic Scenes with Large-Scale Audio Feature Extraction and SVM
This work describes our contribution to the IEEE AASP Challenge on classification of acoustic scenes. From the 30-second, highly variable recordings, spectral, cepstral, energy and voicing-related audio features are extracted. A sliding window approach is used to obtain statistical functionals of the low-level features on short segments. SVMs are used for classification of these short segmen...
Capturing the Acoustic Scene Characteristics for Audio Scene Detection
Scene detection on user-generated content (UGC) aims to classify an audio recording that belongs to a specific scene such as busy street, office or supermarket rather than a sound such as car noise, computer keyboard or cash machine. The difficulty of scene content analysis on UGC lies in the lack of structure and acoustic variability of the audio. The i-vector system is state-of-the-art in Spe...
Deep Sequential Image Features for Acoustic Scene Classification
For the Acoustic Scene Classification task of the IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE2017), we propose a novel method to classify 15 different acoustic scenes using deep sequential learning, based on features extracted from Short-Time Fourier Transform and scalogram of the audio scenes using Convolutional Neural Networks. It is the first time...
IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events: Multiresolution Auditory Representations for Scene Classification
Here, we propose a framework that provides a detailed analysis of the spectrotemporal modulations in the acoustic signal, augmented with a discriminative classifier using support vector machines. We have seen that such a representation is successful at capturing the nontrivial commonalities within a sound class and the differences between different classes [1, 2, 3].
Publication date: 2013